Verbal chunk extraction in French using limited resources
Abstract
A method for extracting French verbal chunks, both inflected and infinitive, is explored and tested on a real corpus. It uses declarative morphological and local grammar rules that specify chunks and some simple contextual structures, relying on limited lexical information and a few simple heuristic and statistical properties obtained from restricted corpora. The specific goals, the architecture and formalism of the system, the linguistic information on which it relies, and the results obtained on a real corpus are presented.
Similar resources
Verbal chunk extraction in French using limited resources (cs.CL/0408060v1, 26 Aug 2004)
Resources and Techniques for Multilingual Information Extraction
Official travel warnings published regularly in the internet by the ministries for foreign affairs of France, Germany, and the UK provide a useful resource for assessing the risks associated with travelling to some countries. The shallow IE system SProUT has been extended to meet the specific needs of delivering a language-neutral output for English, French, or German input texts. A shared type...
Extraction of Definitional Contexts using Lexical Relations
In this paper we present a method for automatically extracting definitional contexts from restricted domains in Spanish. Definitional contexts are textual fragments where there is an implicit definition that can be identified by taking into account verbal patterns linking a term and its corresponding definition. Our interest is in definitional contexts with analytical definitions. Therefore, we...
TRACX: a recognition-based connectionist framework for sequence segmentation and chunk extraction.
Individuals of all ages extract structure from the sequences of patterns they encounter in their environment, an ability that is at the very heart of cognition. Exactly what underlies this ability has been the subject of much debate over the years. A novel mechanism, implicit chunk recognition (ICR), is proposed for sequence segmentation and chunk extraction. The mechanism relies on the recogni...
Event Nominals: Annotation Guidelines and a Manually Annotated Corpus in French
LIMSI-CNRS & Univ. Paris-Sud, 91403 Orsay, France. Within the general purpose of information extraction, detection of event descriptions is an important clue. A word referring to an event is more powerful than a single word, because it implies a location, a time, protagonists (persons, organizations...). However, if verbal designations of events are well stu...
Journal: CoRR
Volume: cs.CL/0408060
Publication date: 2004